Truncated Pareto Distribution (truncpareto) — Bounded Power Laws#

The truncated Pareto is a power-law distribution with an upper cutoff. It’s useful when values are heavy-tailed (many small, a few very large) but a true “infinite tail” is impossible due to physics, policy, or measurement limits.

What you’ll learn#

  • classification, support, and parameter constraints

  • the PDF/CDF/PPF in closed form (and why truncation matters)

  • moments (mean/variance/skewness/kurtosis), characteristic function, and entropy

  • parameter intuition: how b (tail exponent) and c (upper cutoff) change the shape

  • a NumPy-only sampler via inverse CDF + Monte Carlo validation

  • practical usage via scipy.stats.truncpareto (pdf, cdf, rvs, fit)

import numpy as np
import plotly.express as px
import plotly.graph_objects as go
import os
import plotly.io as pio

from scipy import optimize, stats

# Plotly rendering (CKC convention)
pio.renderers.default = os.environ.get("PLOTLY_RENDERER", "notebook")

# Reproducibility
rng = np.random.default_rng(7)

np.set_printoptions(precision=4, suppress=True)

1) Title & Classification#

  • Name: truncpareto (upper truncated Pareto distribution)

  • Type: Continuous

  • Support (standardized): \(x \in [1, c]\)

  • Parameter space (standardized): \(b > 0\), \(c > 1\)

SciPy also supports a shift/scale: if \(Y = \mathrm{loc} + \mathrm{scale}\,X\), then

\[ \mathrm{support}(Y) = [\mathrm{loc} + \mathrm{scale},\; \mathrm{loc} + c\,\mathrm{scale}], \qquad \mathrm{scale} > 0. \]

We write:

\[ X \sim \mathrm{TruncPareto}(b, c). \]

2) Intuition & Motivation#

2.1 What it models#

A Pareto/power-law model says “small values are common, large values are rare, and there’s no typical scale.” Real systems often violate the infinite tail assumption: there is usually a maximum feasible value.

truncpareto captures this by taking a Pareto-like density on \([1, \infty)\) and conditioning on \(X \le c\):

\[ X_{\text{trunc}} \;\overset{d}{=}\; (X\mid X \le c), \qquad X\sim \mathrm{Pareto}(b). \]
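The conditioning above can be checked by simulation: draw from a standard Pareto, reject draws above the cutoff, and compare against direct `truncpareto` sampling (the `b`, `c` values below are just for illustration):

```python
import numpy as np
from scipy import stats

b, c = 2.0, 3.0
rng = np.random.default_rng(0)

# Rejection view of truncation: sample Pareto(b), keep only draws with X <= c
raw = stats.pareto.rvs(b, size=200_000, random_state=rng)
conditioned = raw[raw <= c]

# Direct draws from the truncated distribution
direct = stats.truncpareto.rvs(b, c, size=conditioned.size, random_state=rng)

# A two-sample KS test should not reject: both samples share one distribution
print(stats.ks_2samp(conditioned, direct).pvalue)
```

Rejection sampling wastes draws when \(c\) is small; the inverse-CDF sampler in Section 7 avoids that.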

2.2 Typical real-world use cases#

  • Earthquake energy / event sizes: power-law behavior with a physical upper cutoff.

  • Wildfire sizes: heavy-tailed, but bounded by geography and time.

  • File sizes / network flows: heavy tails with protocol or system caps.

  • Wealth / claims with policy caps: heavy-tailed outcomes with maximum payout.

2.3 Relations to other distributions#

  • Pareto: as \(c \to \infty\), the truncation vanishes and truncpareto approaches a standard Pareto.

  • Log-uniform: as \(b \to 0^+\), the density approaches \(f(x) \propto 1/x\) on \([1,c]\) (uniform in \(\log x\)).

  • Alternatives: lognormal, Weibull, and exponential families can mimic heavy tails over finite ranges; model comparison is often empirical.
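Both limiting relations are easy to verify numerically against SciPy's `pareto` and `loguniform` (the specific `b`, `c` values below are illustrative):

```python
import numpy as np
from scipy import stats

x = np.array([1.5, 3.0, 10.0])

# c -> infinity: the truncation vanishes and the PDF matches a standard Pareto
b = 2.0
print(stats.truncpareto.pdf(x, b, 1e8))
print(stats.pareto.pdf(x, b))

# b -> 0+: the density approaches log-uniform on [1, c]
c = 20.0
print(stats.truncpareto.pdf(x, 1e-5, c))
print(stats.loguniform.pdf(x, 1, c))
```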

3) Formal Definition#

Let \(b>0\) and \(c>1\). Define the normalization constant

\[ k(b,c) = \frac{b}{1 - c^{-b}}. \]

3.1 PDF#

\[\begin{split} f(x\mid b,c)= \begin{cases} k(b,c)\,x^{-(b+1)}, & 1 \le x \le c,\\ 0, & \text{otherwise.} \end{cases} \end{split}\]

3.2 CDF#

\[\begin{split} F(x\mid b,c)= \begin{cases} 0, & x < 1,\\ \dfrac{1 - x^{-b}}{1 - c^{-b}}, & 1 \le x \le c,\\ 1, & x \ge c. \end{cases} \end{split}\]

3.3 Quantile function (PPF)#

Solving \(F(x)=u\) gives

\[ F^{-1}(u) = \left(1 - u(1 - c^{-b})\right)^{-1/b}, \qquad u\in[0,1]. \]
def _check_truncpareto_params(b: float, c: float) -> tuple[float, float]:
    b = float(b)
    c = float(c)
    if not (b > 0):
        raise ValueError("b must be > 0")
    if not (c > 1):
        raise ValueError("c must be > 1")
    return b, c


def truncpareto_logpdf(x: np.ndarray, b: float, c: float) -> np.ndarray:
    b, c = _check_truncpareto_params(b, c)
    x = np.asarray(x, dtype=float)

    logc = np.log(c)
    den = -np.expm1(-b * logc)  # 1 - c^{-b}
    log_k = np.log(b) - np.log(den)

    out = np.full_like(x, -np.inf, dtype=float)
    mask = (x >= 1) & (x <= c)
    out[mask] = log_k - (b + 1) * np.log(x[mask])
    return out


def truncpareto_pdf(x: np.ndarray, b: float, c: float) -> np.ndarray:
    return np.exp(truncpareto_logpdf(x, b, c))


def truncpareto_cdf(x: np.ndarray, b: float, c: float) -> np.ndarray:
    b, c = _check_truncpareto_params(b, c)
    x = np.asarray(x, dtype=float)

    logc = np.log(c)
    den = -np.expm1(-b * logc)  # 1 - c^{-b}

    out = np.zeros_like(x, dtype=float)
    out[x >= c] = 1.0

    mask = (x >= 1) & (x < c)
    if np.any(mask):
        num = -np.expm1(-b * np.log(x[mask]))  # 1 - x^{-b}
        out[mask] = num / den

    return out


def truncpareto_ppf(u: np.ndarray, b: float, c: float) -> np.ndarray:
    b, c = _check_truncpareto_params(b, c)
    u = np.asarray(u, dtype=float)
    if np.any((u < 0) | (u > 1)):
        raise ValueError("u must be in [0, 1]")

    c_neg_b = np.exp(-b * np.log(c))  # c^{-b}
    x_neg_b = 1.0 - u * (1.0 - c_neg_b)

    x = x_neg_b ** (-1.0 / b)
    return np.clip(x, 1.0, c)

4) Moments & Properties#

Because the support is bounded (\(1\le X\le c\)), all moments exist.

4.1 Raw moments#

For any real \(r\):

\[ \mathbb{E}[X^r] = \frac{b}{1-c^{-b}}\int_1^c x^{r-b-1}\,dx. \]

This gives the closed form

\[\begin{split} \mathbb{E}[X^r] = \begin{cases} \dfrac{b}{1-c^{-b}}\,\dfrac{c^{r-b}-1}{r-b}, & r\ne b,\\ \dfrac{b}{1-c^{-b}}\,\log c, & r=b. \end{cases} \end{split}\]

In particular, the mean is \(\mathbb{E}[X]=\mathbb{E}[X^1]\) and the second raw moment is \(\mathbb{E}[X^2]\). The variance is \(\mathrm{Var}(X)=\mathbb{E}[X^2]-\mathbb{E}[X]^2\).

4.2 Skewness and kurtosis#

Using raw moments \(m_r=\mathbb{E}[X^r]\):

\[ \mu_3 = \mathbb{E}[(X-\mu)^3] = m_3 - 3\mu m_2 + 2\mu^3, \qquad \mu_4 = m_4 - 4\mu m_3 + 6\mu^2 m_2 - 3\mu^4. \]

Then

\[ \gamma_1 = \frac{\mu_3}{\sigma^3}, \qquad \text{excess kurtosis} = \frac{\mu_4}{\sigma^4} - 3. \]

4.3 MGF / characteristic function#

Since \(X\) is bounded, the MGF exists for all real \(t\):

\[ M_X(t) = \mathbb{E}[e^{tX}] = \frac{b}{1-c^{-b}}\int_1^c x^{-(b+1)}e^{tx}\,dx. \]

It can be written in terms of the upper incomplete gamma function (shown here for \(t<0\); the \(t>0\) case follows by analytic continuation):

\[ M_X(t)=\frac{b}{1-c^{-b}}(-t)^b\bigl[\Gamma(-b,-t) - \Gamma(-b,-tc)\bigr]. \]

The characteristic function is \(\varphi_X(\omega)=M_X(i\omega)\).
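Incomplete gamma functions of negative order are awkward to evaluate directly, so a quadrature-based sanity check of the MGF is often more convenient; a minimal sketch (the helper name is ours):

```python
import numpy as np
from scipy import integrate, stats

def truncpareto_mgf_quad(t: float, b: float, c: float) -> float:
    """MGF via quadrature of k * x^{-(b+1)} * e^{tx} over [1, c]."""
    k = b / -np.expm1(-b * np.log(c))  # normalization b / (1 - c^{-b})
    val, _ = integrate.quad(lambda x: k * x ** (-(b + 1)) * np.exp(t * x), 1.0, c)
    return val

b0, c0 = 2.0, 5.0
print(truncpareto_mgf_quad(0.0, b0, c0))  # M(0) = 1

# M'(0) = E[X]: central finite difference vs SciPy's mean
h = 1e-5
deriv = (truncpareto_mgf_quad(h, b0, c0) - truncpareto_mgf_quad(-h, b0, c0)) / (2 * h)
print(deriv, stats.truncpareto.mean(b0, c0))
```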

4.4 Entropy#

The differential entropy has a closed form:

\[ h(X) = -\log\left(\frac{b}{1-c^{-b}}\right) + (b+1)\left(\frac{1}{b} - \frac{c^{-b}\log c}{1-c^{-b}}\right). \]

(A useful numerically stable identity is \(\dfrac{c^{-b}}{1-c^{-b}} = \dfrac{1}{c^b-1}\).)

def truncpareto_raw_moment(order: float, b: float, c: float) -> float:
    b, c = _check_truncpareto_params(b, c)
    r = float(order)

    logc = np.log(c)
    den = -np.expm1(-b * logc)  # 1 - c^{-b}
    k = b / den

    if np.isclose(r, b):
        return k * logc

    num = np.expm1((r - b) * logc)  # c^{r-b} - 1
    return k * num / (r - b)


def truncpareto_moments(b: float, c: float) -> dict:
    m1 = truncpareto_raw_moment(1.0, b, c)
    m2 = truncpareto_raw_moment(2.0, b, c)
    m3 = truncpareto_raw_moment(3.0, b, c)
    m4 = truncpareto_raw_moment(4.0, b, c)

    var = m2 - m1**2
    std = np.sqrt(var)

    mu3 = m3 - 3 * m1 * m2 + 2 * m1**3
    mu4 = m4 - 4 * m1 * m3 + 6 * (m1**2) * m2 - 3 * m1**4

    skew = mu3 / (std**3)
    excess_kurt = mu4 / (var**2) - 3

    return {
        "mean": m1,
        "var": var,
        "skew": skew,
        "excess_kurt": excess_kurt,
        "mode": 1.0,
        "median": float(truncpareto_ppf(0.5, b, c)),
    }


def truncpareto_entropy(b: float, c: float) -> float:
    b, c = _check_truncpareto_params(b, c)

    logc = np.log(c)
    den = -np.expm1(-b * logc)  # 1 - c^{-b}
    log_k = np.log(b) - np.log(den)

    # E[log X] = 1/b - log(c)/(c^b - 1)
    elogx = (1.0 / b) - (logc / np.expm1(b * logc))

    return -log_k + (b + 1) * elogx


# Quick numerical check against SciPy
b0, c0 = 2.0, 5.0
ours = truncpareto_moments(b0, c0)
mean_s, var_s, skew_s, exkurt_s = stats.truncpareto.stats(b0, c0, moments="mvsk")

{
    "mean (ours, scipy)": (ours["mean"], float(mean_s)),
    "var  (ours, scipy)": (ours["var"], float(var_s)),
    "skew (ours, scipy)": (ours["skew"], float(skew_s)),
    "excess_kurt (ours, scipy)": (ours["excess_kurt"], float(exkurt_s)),
    "entropy (ours, scipy)": (truncpareto_entropy(b0, c0), float(stats.truncpareto.entropy(b0, c0))),
}
{'mean (ours, scipy)': (1.666666666666667, 1.6666666666666667),
 'var  (ours, scipy)': (0.5752178731265971, 0.5752178731265976),
 'skew (ours, scipy)': (1.897052928779272, 1.8970529287792715),
 'excess_kurt (ours, scipy)': (3.587240444439618, 3.5872404444395976),
 'entropy (ours, scipy)': (0.5648510858655371, 0.564851085865537)}

5) Parameter Interpretation#

truncpareto has two standardized parameters:

  • \(b\) (shape / tail exponent):

    • Appears in \(x^{-(b+1)}\).

    • Larger \(b\) puts more mass near 1 (lighter tail).

    • Smaller \(b\) produces a heavier tail (more probability near \(c\)).

    • Limit: \(b \to 0^+\) gives \(f(x) \to \dfrac{1}{x\log c}\) (log-uniform on \([1,c]\)).

  • \(c\) (upper cutoff):

    • The maximum possible value (standardized).

    • Larger \(c\) increases the range and typically increases mean/variance.

    • Limit: \(c\to\infty\) recovers the classic Pareto on \([1,\infty)\).

Scaling to arbitrary bounds#

If you want support \([x_{\min}, x_{\max}]\) with \(0<x_{\min}<x_{\max}\), set

\[ Y = x_{\min}X,\qquad X\sim \mathrm{TruncPareto}(b, c=x_{\max}/x_{\min}). \]

In SciPy this corresponds to scale=x_min, c=x_max/x_min (and typically loc=0).
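For example, to place the support on a hypothetical range \([2, 50]\) (both bounds and the tail exponent below are chosen for illustration):

```python
import numpy as np
from scipy import stats

x_min, x_max, b = 2.0, 50.0, 1.5
rv = stats.truncpareto(b, x_max / x_min, loc=0.0, scale=x_min)

print(rv.support())  # (2.0, 50.0)

s = rv.rvs(size=10_000, random_state=np.random.default_rng(0))
print(s.min(), s.max())  # every draw lies inside [2, 50]
```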

# How b changes the shape (fixed c)
c_fixed = 10.0
x = np.linspace(1.0, c_fixed, 800)

b_values = [0.3, 1.0, 2.0, 6.0]

fig = go.Figure()
for b in b_values:
    fig.add_trace(
        go.Scatter(x=x, y=truncpareto_pdf(x, b, c_fixed), mode="lines", name=f"b={b:g}")
    )

fig.update_layout(
    title=f"truncpareto PDF: varying b (c={c_fixed:g})",
    xaxis_title="x",
    yaxis_title="density",
    width=900,
    height=420,
)
fig.show()


# How c changes the shape (fixed b)
b_fixed = 2.0
c_values = [1.5, 2.0, 5.0, 20.0]
x = np.linspace(1.0, max(c_values), 900)

fig = go.Figure()
for c in c_values:
    mask = x <= c
    fig.add_trace(
        go.Scatter(
            x=x[mask],
            y=truncpareto_pdf(x[mask], b_fixed, c),
            mode="lines",
            name=f"c={c:g}",
        )
    )

fig.update_layout(
    title=f"truncpareto PDF: varying c (b={b_fixed:g})",
    xaxis_title="x",
    yaxis_title="density",
    width=900,
    height=420,
)
fig.show()

6) Derivations#

6.1 Normalization constant#

Start with \(f(x)=Kx^{-(b+1)}\) on \([1,c]\). Enforce total probability 1:

\[ 1 = \int_1^c Kx^{-(b+1)}\,dx = K\left[\frac{1-x^{-b}}{b}\right]_{1}^{c} = K\,\frac{1-c^{-b}}{b}. \]

So

\[ K = \frac{b}{1-c^{-b}}. \]

6.2 Expectation (general power moment)#

For \(r\in\mathbb{R}\),

\[ \mathbb{E}[X^r]=\int_1^c x^r f(x)\,dx = \frac{b}{1-c^{-b}}\int_1^c x^{r-b-1}\,dx. \]

If \(r\ne b\):

\[ \mathbb{E}[X^r]=\frac{b}{1-c^{-b}}\left[\frac{x^{r-b}}{r-b}\right]_1^c = \frac{b}{1-c^{-b}}\,\frac{c^{r-b}-1}{r-b}. \]

If \(r=b\), the integrand becomes \(x^{-1}\) and the integral is \(\log c\).
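A quadrature cross-check of this closed form, including the \(r=b\) branch (the helper names are ours):

```python
import numpy as np
from scipy import integrate

def raw_moment_closed(r: float, b: float, c: float) -> float:
    """Closed-form E[X^r] from the derivation above."""
    k = b / -np.expm1(-b * np.log(c))
    if np.isclose(r, b):
        return k * np.log(c)  # r = b: the integrand reduces to x^{-1}
    return k * np.expm1((r - b) * np.log(c)) / (r - b)

def raw_moment_quad(r: float, b: float, c: float) -> float:
    """E[X^r] by direct numerical integration of x^r f(x)."""
    k = b / -np.expm1(-b * np.log(c))
    val, _ = integrate.quad(lambda x: k * x ** (r - b - 1.0), 1.0, c)
    return val

b0, c0 = 2.0, 5.0
for r in (0.5, 1.0, 3.0, b0):  # include the r = b case
    print(r, raw_moment_closed(r, b0, c0), raw_moment_quad(r, b0, c0))
```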

6.3 Variance#

Using the raw moments \(m_1=\mathbb{E}[X]\) and \(m_2=\mathbb{E}[X^2]\):

\[ \mathrm{Var}(X) = m_2 - m_1^2. \]

6.4 Likelihood (i.i.d. sample)#

For data \(x_1,\dots,x_n\) with \(1\le x_i\le c\):

\[ \mathcal{L}(b,c) = \prod_{i=1}^n \frac{b}{1-c^{-b}}x_i^{-(b+1)}. \]

The log-likelihood is

\[ \ell(b,c)= n\log b - n\log(1-c^{-b}) - (b+1)\sum_{i=1}^n \log x_i. \]

The constraint \(c \ge \max_i x_i\) is part of the model’s support. The only dependence on \(c\) is through \(-n\log(1-c^{-b})\), which is decreasing in \(c\), so the log-likelihood is maximized at the smallest admissible cutoff:

\[ \hat{c} = \max_i x_i. \]

With \(c\) fixed/known, the score equation for \(b\) is

\[ \frac{\partial\ell}{\partial b} = \frac{n}{b} - n\frac{\log c}{c^b - 1} - \sum_{i=1}^n \log x_i = 0, \]

which is typically solved numerically.

def truncpareto_loglikelihood(x: np.ndarray, b: float, c: float) -> float:
    x = np.asarray(x, dtype=float)
    b, c = _check_truncpareto_params(b, c)

    if np.any((x < 1) | (x > c)):
        return -np.inf

    n = x.size
    logc = np.log(c)
    den = -np.expm1(-b * logc)  # 1 - c^{-b}
    log_k = np.log(b) - np.log(den)

    return n * log_k - (b + 1) * np.sum(np.log(x))


def truncpareto_mle_c(x: np.ndarray) -> float:
    x = np.asarray(x, dtype=float)
    if np.any(x < 1):
        raise ValueError("For the standardized model, all data must satisfy x >= 1.")
    return float(np.max(x))


def _score_b(b: float, x: np.ndarray, c: float) -> float:
    n = x.size
    sum_logx = np.sum(np.log(x))
    # Compute c^b - 1 via expm1: the naive c**b - 1.0 cancels catastrophically
    # against n/b for tiny b, which can flip the sign of the score at the
    # lower bracket and make the solver spuriously return the boundary.
    return (n / b) - (n * np.log(c) / np.expm1(b * np.log(c))) - sum_logx


def truncpareto_mle_b(x: np.ndarray, c: float) -> float:
    x = np.asarray(x, dtype=float)
    c = float(c)
    if not (c > 1):
        raise ValueError("c must be > 1")
    if np.any((x < 1) | (x > c)):
        raise ValueError("All data must lie in [1, c] for the standardized model.")

    lo = 1e-8
    hi = 10.0
    f_lo = _score_b(lo, x, c)
    f_hi = _score_b(hi, x, c)

    # If the score is already negative at tiny b, the likelihood is decreasing from the boundary.
    if f_lo <= 0:
        return lo

    # Expand the upper bracket until the score becomes negative.
    while (f_hi > 0) and (hi < 1e6):
        hi *= 2
        f_hi = _score_b(hi, x, c)

    if f_hi > 0:
        # Extremely flat / boundary-ish case
        return hi

    return float(optimize.brentq(lambda bb: _score_b(bb, x, c), lo, hi))


# Demo: MLE on synthetic data
b_true, c_true = 2.0, 5.0
x = stats.truncpareto.rvs(b_true, c_true, size=4000, random_state=rng)

c_hat = truncpareto_mle_c(x)  # boundary MLE
b_hat_known_c = truncpareto_mle_b(x, c_true)

{
    "true (b, c)": (b_true, c_true),
    "MLE c_hat (=max x)": c_hat,
    "MLE b_hat (assuming c known)": b_hat_known_c,
}
{'true (b, c)': (2.0, 5.0),
 'MLE c_hat (=max x)': 4.966201346197002,
 'MLE b_hat (assuming c known)': ≈ 2.0}

7) Sampling & Simulation#

The CDF has a closed-form inverse, so inverse transform sampling is straightforward.

Starting from

\[ F(x)=\frac{1-x^{-b}}{1-c^{-b}},\qquad 1\le x\le c, \]

set \(U\sim\mathrm{Uniform}(0,1)\) and solve \(F(X)=U\):

\[ X = \left(1 - U(1 - c^{-b})\right)^{-1/b}. \]

This is efficient, stable, and uses only uniform random numbers.

def truncpareto_rvs_numpy(b: float, c: float, size: int, rng: np.random.Generator) -> np.ndarray:
    b, c = _check_truncpareto_params(b, c)
    u = rng.random(size)
    return truncpareto_ppf(u, b, c)


# Compare NumPy-only sampler to SciPy sampler (quick two-sample KS test)
b0, c0 = 2.0, 5.0
n = 50_000

samples_numpy = truncpareto_rvs_numpy(b0, c0, size=n, rng=rng)
samples_scipy = stats.truncpareto.rvs(b0, c0, size=n, random_state=rng)

stats.ks_2samp(samples_numpy, samples_scipy)
KstestResult(statistic=0.006680000000000019, pvalue=0.21359969753360408, statistic_location=1.5599207153635837, statistic_sign=-1)

8) Visualization#

We’ll visualize:

  • the PDF and CDF for chosen parameters

  • Monte Carlo samples (histogram + theoretical PDF overlay)

  • the empirical vs theoretical CDF

b0, c0 = 1.7, 8.0
x = np.linspace(1.0, c0, 900)

# PDF and CDF
fig = go.Figure()
fig.add_trace(go.Scatter(x=x, y=truncpareto_pdf(x, b0, c0), mode="lines", name="PDF"))
fig.update_layout(
    title=f"truncpareto PDF (b={b0:g}, c={c0:g})",
    xaxis_title="x",
    yaxis_title="density",
    width=900,
    height=380,
)
fig.show()

fig = go.Figure()
fig.add_trace(go.Scatter(x=x, y=truncpareto_cdf(x, b0, c0), mode="lines", name="CDF"))
fig.update_layout(
    title=f"truncpareto CDF (b={b0:g}, c={c0:g})",
    xaxis_title="x",
    yaxis_title="F(x)",
    width=900,
    height=380,
)
fig.show()

# Monte Carlo sample vs PDF
n = 40_000
samples = truncpareto_rvs_numpy(b0, c0, size=n, rng=rng)

fig = go.Figure()
fig.add_trace(
    go.Histogram(
        x=samples,
        histnorm="probability density",
        nbinsx=60,
        name="samples",
        opacity=0.35,
    )
)
fig.add_trace(go.Scatter(x=x, y=truncpareto_pdf(x, b0, c0), mode="lines", name="theoretical PDF"))

fig.update_layout(
    title=f"Monte Carlo samples vs PDF (n={n:,}, b={b0:g}, c={c0:g})",
    xaxis_title="x",
    yaxis_title="density",
    width=900,
    height=420,
    bargap=0.02,
)
fig.show()

# Empirical CDF vs theoretical CDF
xs = np.sort(samples)
ecdf = np.arange(1, xs.size + 1) / xs.size

fig = go.Figure()
fig.add_trace(go.Scatter(x=xs, y=ecdf, mode="lines", name="empirical CDF"))
fig.add_trace(go.Scatter(x=x, y=truncpareto_cdf(x, b0, c0), mode="lines", name="theoretical CDF"))
fig.update_layout(
    title="Empirical vs theoretical CDF",
    xaxis_title="x",
    yaxis_title="F(x)",
    width=900,
    height=420,
)
fig.show()

9) SciPy Integration (scipy.stats.truncpareto)#

SciPy implements the standardized distribution with shape parameters b and c.

Key methods:

  • pdf, logpdf, cdf, ppf

  • rvs (sampling)

  • stats(..., moments='mvsk') and entropy

  • fit (MLE)

Remember: with loc and scale, SciPy transforms \(X\) via \(Y=\mathrm{loc}+\mathrm{scale}\,X\).

b0, c0 = 1.7, 8.0
rv = stats.truncpareto(b0, c0)

# Evaluate core functions
x = np.linspace(1.0, c0, 6)
{
    "x": x,
    "pdf": rv.pdf(x),
    "cdf": rv.cdf(x),
    "ppf(0.1,0.5,0.9)": rv.ppf([0.1, 0.5, 0.9]),
}


# Sampling
data = rv.rvs(size=5000, random_state=rng)

# Fitting (fix loc=0, scale=1 for the standardized model)
b_hat, c_hat, loc_hat, scale_hat = stats.truncpareto.fit(data, floc=0, fscale=1)
(b_hat, c_hat, loc_hat, scale_hat)
(1.6649106936679934, 7.94363698958849, 0, 1)

10) Statistical Use Cases#

10.1 Hypothesis testing / goodness-of-fit#

A common workflow is:

  1. fit \((b,c)\) (often fixing loc/scale)

  2. assess fit using diagnostic plots (PDF/CDF/QQ) and tests (e.g. KS)

Important: when parameters are estimated from the same data, classical p-values from KS tests are no longer exact. A more principled approach is a parametric bootstrap under the fitted model.
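A minimal parametric-bootstrap sketch (the function name and bootstrap size below are ours; `n_boot` should be larger in practice):

```python
import numpy as np
from scipy import stats

def ks_pvalue_parametric_bootstrap(x: np.ndarray, n_boot: int = 200, seed: int = 0) -> float:
    """Bootstrap the null distribution of the KS statistic, refitting each replicate."""
    rng = np.random.default_rng(seed)
    b_hat, c_hat, _, _ = stats.truncpareto.fit(x, floc=0, fscale=1)
    d_obs = stats.kstest(x, "truncpareto", args=(b_hat, c_hat)).statistic

    d_boot = np.empty(n_boot)
    for i in range(n_boot):
        # Simulate under the fitted model, then refit so that the null
        # distribution of the statistic accounts for parameter estimation
        xb = stats.truncpareto.rvs(b_hat, c_hat, size=x.size, random_state=rng)
        bb, cb, _, _ = stats.truncpareto.fit(xb, floc=0, fscale=1)
        d_boot[i] = stats.kstest(xb, "truncpareto", args=(bb, cb)).statistic

    return float(np.mean(d_boot >= d_obs))
```

The returned value approximates \(P(D \ge D_{\text{obs}})\) under the fitted model, replacing the optimistic classical KS p-value.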

10.2 Bayesian modeling#

Truncated Pareto priors are natural when you want a heavy-tailed prior but also know a hard upper bound. A classic example is an unknown “maximum” parameter \(\theta\) (e.g., the endpoint of a Uniform distribution).

10.3 Generative modeling#

If you need realistic synthetic data with a capped power-law tail (event sizes, file sizes, claim sizes), truncpareto is a simple, interpretable generator.

# 10.1 Example: (diagnostic) KS test against a fitted truncpareto
b_true, c_true = 1.8, 10.0
x = stats.truncpareto.rvs(b_true, c_true, size=2500, random_state=rng)

b_hat, c_hat, _, _ = stats.truncpareto.fit(x, floc=0, fscale=1)
ks = stats.kstest(x, "truncpareto", args=(b_hat, c_hat))

{
    "true (b, c)": (b_true, c_true),
    "fitted (b, c)": (float(b_hat), float(c_hat)),
    "KS statistic": float(ks.statistic),
    "KS p-value (not exact after fit)": float(ks.pvalue),
}
{'true (b, c)': (1.8, 10.0),
 'fitted (b, c)': (1.829190701746747, 9.973639795656418),
 'KS statistic': 0.012295004694353129,
 'KS p-value (not exact after fit)': 0.8395284823777183}
# 10.2 Example: truncated Pareto prior for an unknown Uniform endpoint
# Data model: Y_i | theta ~ Uniform(0, theta)
# Prior: theta ~ TruncPareto(b0, c=U/L) scaled to [L, U]

theta_true = 8.0
n = 40
y = rng.uniform(0.0, theta_true, size=n)
y_max = float(np.max(y))

L, U = 1.0, 20.0
b_prior = 2.0

prior = stats.truncpareto(b_prior, U / L, loc=0.0, scale=L)

# Posterior under the Uniform likelihood:
# p(theta | y) ∝ theta^{-n} * theta^{-(b_prior+1)} = theta^{-(b_prior+n+1)} on [max(L, y_max), U]
L_post = max(L, y_max)
b_post = b_prior + n
post = stats.truncpareto(b_post, U / L_post, loc=0.0, scale=L_post)

grid = np.linspace(L, U, 900)
fig = go.Figure()
fig.add_trace(go.Scatter(x=grid, y=prior.pdf(grid), mode="lines", name=f"prior (b={b_prior:g})"))
fig.add_trace(go.Scatter(x=grid, y=post.pdf(grid), mode="lines", name=f"posterior (b={b_post:g})", line=dict(width=3)))
fig.add_vline(x=y_max, line_dash="dash", line_color="gray", annotation_text="max(y)")

fig.update_layout(
    title=f"Uniform endpoint inference (n={n}, theta_true={theta_true:g})",
    xaxis_title="theta",
    yaxis_title="density",
    width=900,
    height=420,
)
fig.show()

{
    "y_max": y_max,
    "posterior mean": float(post.mean()),
    "95% credible interval": tuple(map(float, post.ppf([0.025, 0.975]))),
}
{'y_max': 7.96832887580269,
 'posterior mean': 8.162678360578365,
 '95% credible interval': (7.9731336719402, 8.699845415743587)}
# 10.3 Example: generative modeling + CCDF (survival) on log-log axes
b0, c0 = 1.3, 50.0
n = 80_000
s = truncpareto_rvs_numpy(b0, c0, size=n, rng=rng)
s_sorted = np.sort(s)

# Empirical survival S(x) = P(X > x); the final value is 0 and drops off the log axis
surv_emp = 1.0 - (np.arange(1, n + 1) / n)

x_grid = np.logspace(0, np.log10(c0), 500)
surv_theory = 1.0 - truncpareto_cdf(x_grid, b0, c0)

fig = go.Figure()
fig.add_trace(go.Scatter(x=s_sorted, y=surv_emp, mode="lines", name="empirical survival"))
fig.add_trace(go.Scatter(x=x_grid, y=surv_theory, mode="lines", name="theoretical survival"))

fig.update_layout(
    title=f"Survival function on log-log scale (b={b0:g}, c={c0:g})",
    xaxis_title="x",
    yaxis_title="P(X > x)",
    xaxis_type="log",
    yaxis_type="log",
    width=900,
    height=420,
)
fig.show()

11) Pitfalls#

  • Invalid parameters: require \(b>0\) and \(c>1\).

  • Support mismatches: the standardized model assumes \(x\in[1,c]\). If your natural lower bound is \(x_{\min}\ne 1\), use a scale transform.

  • Boundary behavior when fitting:

    • With \(c\) unknown, the MLE is the sample maximum (\(\hat c=\max x_i\)), which is a boundary estimate.

    • If you know the physical cutoff, treat \(c\) as fixed rather than estimated.

  • Numerical issues when \(c\approx 1\): \(1-c^{-b}\) can be tiny; prefer logpdf and expm1/log1p-style computations.

  • Interpreting power laws: many distributions can look linear on log-log plots over a short range; validate with diagnostics and model comparison.
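The \(c\approx 1\) cancellation is easy to demonstrate (the `eps` value below is illustrative):

```python
import numpy as np

b = 2.0
eps = 1e-12
c = 1.0 + eps  # cutoff extremely close to the lower support bound

naive = 1.0 - c ** (-b)                 # catastrophic cancellation near 1
stable = -np.expm1(-b * np.log1p(eps))  # 1 - c^{-b}, computed stably

# To first order, 1 - c^{-b} ≈ b * eps = 2e-12; the stable form recovers it to
# full precision, while the naive form keeps only a few significant digits.
print(naive, stable)
```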

12) Summary#

  • truncpareto is a continuous bounded power-law on \([1,c]\) with parameters \(b>0\) and \(c>1\).

  • PDF: \(f(x)=\dfrac{b}{1-c^{-b}}x^{-(b+1)}\) on \([1,c]\); CDF: \(F(x)=\dfrac{1-x^{-b}}{1-c^{-b}}\).

  • All moments exist (bounded support); raw moments have a simple closed form.

  • Sampling is easy via the inverse CDF: \(X=(1-U(1-c^{-b}))^{-1/b}\).

  • SciPy’s scipy.stats.truncpareto provides pdf/cdf/rvs/fit, plus moments and entropy.